Locality-sensitive binary codes from shift-invariant kernels
Abstract
This paper addresses the problem of designing binary codes for high-dimensional data such that vectors that are similar in the original space map to similar binary strings. We introduce a simple distribution-free encoding scheme based on random projections, such that the expected Hamming distance between the binary codes of two vectors is related to the value of a shift-invariant kernel (e.g., a Gaussian kernel) between the vectors. We present a full theoretical analysis of the convergence properties of the proposed scheme, and report favorable experimental performance as compared to a recent state-of-the-art method, spectral hashing.
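The abstract does not spell out the construction, so the following is only a minimal sketch of one way a random-projection encoder of this kind can look. It assumes a Gaussian kernel exp(-gamma * ||x - y||^2 / 2), whose spectral distribution is N(0, gamma * I), and it assumes each bit is obtained by thresholding a randomly shifted cosine of a random projection. The names `make_encoder`, `encode`, `hamming`, and the parameter `gamma` are hypothetical and chosen for illustration.

```python
import math
import random


def make_encoder(dim, n_bits, gamma=1.0, seed=0):
    """Draw the random parameters for an n_bits-bit encoder (illustrative helper)."""
    rng = random.Random(seed)
    params = []
    for _ in range(n_bits):
        # Assumption: w ~ N(0, gamma * I), the spectral distribution of the
        # Gaussian kernel exp(-gamma * ||x - y||^2 / 2).
        w = [rng.gauss(0.0, math.sqrt(gamma)) for _ in range(dim)]
        b = rng.uniform(0.0, 2.0 * math.pi)  # random phase shift
        t = rng.uniform(-1.0, 1.0)           # random threshold
        params.append((w, b, t))
    return params


def encode(x, params):
    """Map a real vector x to a list of 0/1 bits, one bit per (w, b, t) triple."""
    bits = []
    for w, b, t in params:
        proj = sum(wi * xi for wi, xi in zip(w, x))
        # Bit is 1 when the shifted cosine feature exceeds the random threshold.
        bits.append(1 if math.cos(proj + b) + t >= 0.0 else 0)
    return bits


def hamming(a, b):
    """Number of positions where two equal-length bit lists differ."""
    return sum(u != v for u, v in zip(a, b))


if __name__ == "__main__":
    params = make_encoder(dim=3, n_bits=512)
    x = [0.0, 0.0, 0.0]
    near = [0.1, 0.0, 0.0]
    far = [3.0, 0.0, 0.0]
    print("near:", hamming(encode(x, params), encode(near, params)))
    print("far: ", hamming(encode(x, params), encode(far, params)))
```

Under these assumptions, a pair of nearby vectors (high kernel value) should disagree on far fewer bits than a distant pair (kernel value near zero), which is the qualitative behavior the abstract describes. The encoder is distribution-free in the sense that its parameters are drawn without looking at the data.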
Related papers
Binary Embedding with Additive Homogeneous Kernels
Binary embedding transforms vectors in Euclidean space into the vertices of Hamming space such that Hamming distance between binary codes reflects a particular distance metric. In machine learning, the similarity metrics induced by Mercer kernels are frequently used, leading to the development of binary embedding with Mercer kernels (BE-MK) where the approximate nearest neighbor search is perfo...
Diffusion Hashing
With the worldwide spread of broadband Internet, massive multimedia data, including text, images, and videos, are growing explosively and are available for interactive applications over the Internet. At the same time, fast retrieval from massive multimedia databases has attracted more and more attention. Hash-based Approximate Nearest Neighbor (ANN) search is a technology that ac...
Binary Images of Z2Z4-Additive Cyclic Codes
A Z2Z4-additive code $C \subseteq \mathbb{Z}_2^{\alpha} \times \mathbb{Z}_4^{\beta}$ is called cyclic if the set of coordinates can be partitioned into two subsets, the set of Z2 and the set of Z4 coordinates, such that any cyclic shift of the coordinates of both subsets leaves the code invariant. We study the binary images of Z2Z4-additive cyclic codes. We determine all Z2Z4-additive cyclic codes with odd β whose Gray images are linear binar...
Binary Codes with Locality for Four Erasures
In this paper, binary codes with locality for four erasures are considered. An upper bound on the rate of this class of codes is derived. An optimal construction for codes meeting the bound is also provided. The construction is based on regular bipartite graphs of girth 6 and employs the sequential approach of locally recovering from multiple erasures. An extension of this construction that gen...
Locality Sensitive Hashing Based Clustering
Definition: In learning systems with kernels, the shape and size of a kernel play a critical role in accuracy and generalization. Most kernels have a distance metric parameter, which determines the size and shape of the kernel in the sense of a Mahalanobis distance. Advanced kernel learning tunes every kernel’s distance metric individually, instead of tuning one global distance metric for all ...